Author: RAMU MEDA
Date:
·
Internationalization
allows software to be adapted to any language and cultural convention.
·
During
the internationalization process, the programmer isolates the parts of a
program that are dependent on language and culture
·
Abbreviated
as i18n, because there are 18 letters between the first "i"
and the last "n."
·
Localization
is the process of adapting a program for use in a specific locale.
·
Localization
includes the translation of text such as GUI labels, error messages, and online
help.
·
It also includes the culture-specific
formatting of data items such as monetary values, times, dates, and numbers.
·
Often
abbreviated as l10n, because there are 10 letters between the "l"
and the "n."
Types of data that vary with
region or language:
The objects stored in a ResourceBundle
are usually String
objects. However, not all String
objects are locale-specific. For example, if a String
is a protocol
element used by interprocess communication, it doesn't need
to be localized, because the end users never see it.
Characteristics of an
internationalized program:
Locale Object:
§
A Locale
object represents a
specific geographical, political, or cultural region.
§
An operation that requires a Locale
to perform its task
is called locale-sensitive and uses the Locale
to tailor information for the user. For
example, displaying a number is a locale-sensitive operation--the number should
be formatted according to the customs/conventions of the user's native country,
region, or culture.
§
If you
intend to create international Java applications, you'll definitely use the java.util.Locale
class.
There's no getting around it
§
You
create a Locale
object using one of the two constructors in this class:
o
Locale(String
language, String country)
o
Locale(String language, String country, String
variant)
§
The
country and variant codes are optional. When omitting the country code,
you specify an empty String ("") rather than null; the constructors reject null arguments
.
§
Although
the Locale
constructor accepts a country code in lowercase letters, it promptly converts the code to
uppercase to create the correct internal representation.
§
The
first argument to both constructors is a valid ISO Language Code.
These codes are the lower-case two-letter codes as defined by ISO-639.
§
The
second argument to both constructors is a valid ISO Country Code.
These codes are the upper-case two-letter codes as defined by ISO-3166.
§
The
second constructor requires a third argument--the Variant. The
Variant codes are vendor and browser-specific.
§
Because
a Locale
object
is just an identifier for a region, no validity check
is performed when you construct a Locale
.
§
If you
want to see whether particular resources are available for the Locale
you construct, you
must query those resources. For example, ask the NumberFormat
for the locales it supports using
its getAvailableLocales
method.
§
Note: When you ask for a resource for a
particular locale, you get back the best available match, not necessarily precisely
what you asked for.
§
The Locale
class provides a number of convenient constants that you can use to create Locale
objects for commonly
used locales. For example, the following creates a Locale
object for the
§
Once
you've created a Locale
you can query it for information about itself.
o
Use getCountry
to get the ISO Country Code and getLanguage
to get the ISO Language Code.
o
You
can use getDisplayCountry
to get the name of the country suitable for displaying to the user. Similarly,
you can use getDisplayLanguage
to get the name of the language suitable for displaying to the user.
o
Interestingly, the getDisplayXXX
methods are themselves locale-sensitive and have two versions: one that uses
the default locale and one that uses the locale specified as an argument.
§
The
Java 2 platform provides a number of classes that perform locale-sensitive
operations.
o
the NumberFormat
class formats
numbers, currency, or percentages in a locale-sensitive manner.
§
NumberFormat.getInstance()
§
NumberFormat.getCurrencyInstance()
§
NumberFormat.getPercentInstance()
o
These
methods have two variants; one with an explicit locale and one without; the
latter using the default locale.
§
A Locale
is the mechanism for
identifying the kind of object (NumberFormat
)
that you would like to get. The locale is just a mechanism for
identifying objects, not a container for the objects
themselves.
§
A variant
is an optional extension to a Locale.
Usually you specify variant codes to identify differences caused by the computing
platform
§
The variant
codes conform to no standard. They are arbitrary and specific to your
application.
§
Locale-sensitive
classes support only certain Locale
definitions
§
Although
the Java compiler and run-time environment won't complain if you make up your
own language and country identifiers, you should use the valid codes defined by
ISO standards
§
When
the Java Virtual Machine (JVM) starts up, it
queries the underlying OS for a default-locale setting. You can discover your
default locale programmatically.
§
In a
Java application, each locale-sensitive object is responsible for its own
locale-dependent behavior. A Locale
object doesn't enforce this behavior; it simply acts as an indicator to other
objects. Those objects are then responsible for using the Locale
appropriately.
§
By
design, locale-sensitive classes are independent of each other.
That is, the set of supported Locale
s
in one class does not need to be the same as the set in another class.
§
A Java
application can have multiple locales active at the same time. That is,
it's possible to use a French date format in one place and a different
locale's format in another, because you can assign a Locale
to every
locale-sensitive object in your program. This flexibility allows you to develop
multilingual applications, which can display information in
multiple languages.
§
Scope
of a Locale: On the
Java platform you do not specify a global Locale
by setting an environment variable before running the application. Instead you
either rely on the default Locale or assign a Locale
to each locale-sensitive object.
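The Locale notes above can be illustrated with a minimal sketch. The class name LocaleDemo and its helper methods are hypothetical names chosen for this example; the Locale and NumberFormat calls are standard java.util and java.text APIs. On recent JDKs the Locale constructors are reported as deprecated, which is why the helper carries a suppression annotation.

```java
import java.text.NumberFormat;
import java.util.Locale;

public class LocaleDemo {

    // Builds a Locale from ISO language and country codes, using the
    // two-argument constructor described in the notes.
    @SuppressWarnings("deprecation") // the constructors are deprecated on newer JDKs
    static Locale create(String language, String country) {
        return new Locale(language, country);
    }

    // Formats a number according to the conventions of the given locale.
    static String formatNumber(double value, Locale locale) {
        return NumberFormat.getInstance(locale).format(value);
    }

    // Formats a fraction as a percentage for the given locale.
    static String formatPercent(double fraction, Locale locale) {
        return NumberFormat.getPercentInstance(locale).format(fraction);
    }

    public static void main(String[] args) {
        // No validity check is performed; the country code is normalized to uppercase.
        Locale canadianFrench = create("fr", "ca");
        System.out.println(canadianFrench.getLanguage()); // fr
        System.out.println(canadianFrench.getCountry());  // CA

        // The getDisplayXXX methods are themselves locale-sensitive.
        System.out.println(Locale.FRANCE.getDisplayCountry(Locale.ENGLISH));
        System.out.println(Locale.FRANCE.getDisplayCountry()); // uses the default locale

        // Two locales active at once: each formatter carries its own Locale.
        System.out.println(formatNumber(1234567.891, Locale.US));
        System.out.println(formatNumber(1234567.891, Locale.GERMANY));
        System.out.println(formatPercent(0.25, Locale.US));
        System.out.println(NumberFormat.getCurrencyInstance(Locale.US).format(1234.5));

        // The default locale the JVM picked up from the operating system.
        System.out.println(Locale.getDefault());
    }
}
```

Each formatter is obtained with an explicit Locale, so the program does not depend on a global setting; omitting the argument would fall back to the default locale.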
Resource Bundle:
·
Resource
bundles contain locale-specific objects. When your program needs a
locale-specific resource, a String
for example, your program can load it from the resource bundle that is
appropriate for the current user's locale.
·
A ResourceBundle
is an
example of a locale-sensitive object.
·
This
allows you to write programs that can:
§
be
easily localized, or translated, into different languages
§
handle
multiple locales at once
§
be
easily modified later to support even more locales
·
One
resource bundle is, conceptually, a set of related classes that inherit from ResourceBundle
. Each
related subclass of ResourceBundle
has the same base name plus an additional component that
identifies its locale.
·
Each
related subclass of ResourceBundle
contains the same items, but the items have been
translated for the locale represented by that ResourceBundle
subclass.
·
In
general, the objects stored in a ResourceBundle
are predefined and ship with the product. These objects are not modified while
the program is running
·
When
your program needs a locale-specific object, it loads the ResourceBundle
class using
the getBundle
method:
§
ResourceBundle myResources =
ResourceBundle.getBundle("MyResources", currentLocale);
§
The
first argument specifies the family name of the resource bundle that contains
the object in question. The second argument indicates the desired locale. getBundle
uses these two
arguments to construct the name of the ResourceBundle
subclass it should load as follows.
·
The
resource bundle lookup searches for classes with various suffixes on the basis
of
§
the
desired locale and
§
the
current default locale as returned by Locale.getDefault(), and
§
the
root resource bundle (baseclass),
In the
following order from lower-level (more specific) to parent-level (less
specific):
baseclass + "_" + language1 + "_" + country1 + "_" + variant1
baseclass + "_" + language1 + "_" + country1 + "_" + variant1 + ".properties"
baseclass + "_" + language1 + "_" + country1
baseclass + "_" + language1 + "_" + country1 + ".properties"
baseclass + "_" + language1
baseclass + "_" + language1 + ".properties"
baseclass + "_" + language2 + "_" + country2 + "_" + variant2
baseclass + "_" + language2 + "_" + country2 + "_" + variant2 + ".properties"
baseclass + "_" + language2 + "_" + country2
baseclass + "_" + language2 + "_" + country2 + ".properties"
baseclass + "_" + language2
baseclass + "_" + language2 + ".properties"
baseclass
baseclass + ".properties"
§
The
baseclass must be fully qualified (for example, myPackage.MyResources
, not
just MyResources
).
It must also be accessible by your code; it cannot be a class that is
private to the package where ResourceBundle.getBundle
is called.
§
Resource
bundles contain key/value pairs. The keys uniquely identify a
locale-specific object in the bundle. Here's an example of a ListResourceBundle
that
contains two key/value pairs:
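import java.util.ListResourceBundle;

// A resource bundle holding two key/value pairs.
class MyResource extends ListResourceBundle {
    // Returns the key/value table that ListResourceBundle performs lookups in.
    @Override
    public Object[][] getContents() {
        return contents;
    }

    static final Object[][] contents = {
        // LOCALIZE THIS
        {"OkKey", "OK"},
        {"CancelKey", "Cancel"},
        // END OF MATERIAL TO LOCALIZE
    };
}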
Keys are always Strings. In this example, the keys are OkKey
and CancelKey
. In the
above example, the values are also Strings (OK
and Cancel
), but they don't
have to be. The values can be any type of object.
§
You
retrieve an object from a resource bundle using the appropriate getter
method: button1 = new
Button(myResourceBundle.getString("OkKey"));
§
The
getter methods all require the key as an argument and return the object if
found. If the object is not found, the getter method throws a MissingResourceException
.
§
Besides
getString
;
ResourceBundle supports a number of other methods for getting different types
of objects such as getStringArray
.
If you don't have an object that matches one of these methods, you can use getObject
and cast
the result to the appropriate type.
§
You
should always supply a baseclass with no suffixes. This will be
the class of "last resort", if a locale is requested
that does not exist. In fact, you must provide all of the classes in any
given inheritance chain that you provide a resource for. For example,
if you provide MyResources_fr_BE,
you must provide both MyResources
and MyResources_fr
or the resource bundle lookup won't work right.
§
The
Java 2 platform provides two subclasses of ResourceBundle
,
ListResourceBundle
and PropertyResourceBundle
,
that provide a fairly simple way to create resources. ListResourceBundle
manages
its resource as a List of key/value pairs.
§
PropertyResourceBundle
uses a properties file to manage its
resources.
§
If ListResourceBundle
or PropertyResourceBundle
do
not suit your needs, you can write your own ResourceBundle
subclass. Your subclasses must
override two methods: handleGetObject
and getKeys()
.
§
The
keys must be String
objects in a ListResourceBundle. Both the keys and the values must be
String objects in a PropertyResourceBundle.
§
You
can organize your ResourceBundle
objects according to the category of objects they contain. For example,
you might want to load all of the GUI labels for an order entry window into a ResourceBundle
called OrderLabelsBundle
.
o
Advantages:
Easier to read and maintain; loads into memory quickly; reduces memory usage by
loading the required bundle.
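The resource bundle lookup described in this section can be sketched as follows. BundleDemo, Labels, Labels_fr, and the label helper are hypothetical names for this example. The bundle classes are public static nested classes so that ResourceBundle.getBundle can instantiate them, and the base name is taken from the class itself so that it stays fully qualified.

```java
import java.util.ListResourceBundle;
import java.util.Locale;
import java.util.MissingResourceException;
import java.util.ResourceBundle;

public class BundleDemo {

    // Root bundle: the class of "last resort" with no locale suffix.
    public static class Labels extends ListResourceBundle {
        @Override
        protected Object[][] getContents() {
            return new Object[][] {
                {"OkKey", "OK"},
                {"CancelKey", "Cancel"},
            };
        }
    }

    // French bundle: same base name plus "_fr". It translates only CancelKey;
    // OkKey is found in the root bundle through the parent chain.
    public static class Labels_fr extends ListResourceBundle {
        @Override
        protected Object[][] getContents() {
            return new Object[][] {
                {"CancelKey", "Annuler"},
            };
        }
    }

    // Looks up a label for a locale; returns a visible marker if the key is missing.
    static String label(String key, Locale locale) {
        ResourceBundle bundle = ResourceBundle.getBundle(Labels.class.getName(), locale);
        try {
            return bundle.getString(key);
        } catch (MissingResourceException e) {
            return "!" + key + "!";
        }
    }

    public static void main(String[] args) {
        // No Labels_fr_CA exists, so the lookup settles on Labels_fr.
        System.out.println(label("CancelKey", Locale.CANADA_FRENCH)); // Annuler
        System.out.println(label("OkKey", Locale.CANADA_FRENCH));     // OK (from the root bundle)
        System.out.println(label("CancelKey", Locale.ROOT));          // Cancel
        System.out.println(label("HelpKey", Locale.ROOT));            // !HelpKey!
    }
}
```

Requesting fr_CA returns the best available match rather than exactly what was asked for, which is the behavior the notes describe for resource lookups in general.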
InputStreamReader
§
An
InputStreamReader is a bridge from byte streams to character streams:
It reads bytes and decodes them into characters using a specified charset
.
§
The charset
that it uses may be specified by name or may be given explicitly, or the
platform's default charset may be accepted.
§
Each invocation of one of an
InputStreamReader's read() methods may cause one or more bytes to be read from
the underlying byte-input stream.
§
To
enable the efficient conversion of bytes to characters, more
bytes may be read ahead from the underlying stream than are necessary to
satisfy the current read operation.
§
For
top efficiency, consider wrapping an InputStreamReader within a BufferedReader.
For example: BufferedReader in = new BufferedReader(new
InputStreamReader(System.in));
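A small sketch of the byte-to-character bridge, assuming an in-memory byte array in place of a real file or socket. ReaderDemo and readFirstLine are hypothetical names; the charset is passed explicitly so the result does not depend on the platform default.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ReaderDemo {

    // Decodes the first line of a byte stream using the given charset.
    static String readFirstLine(InputStream in, Charset charset) {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The word "cafe" with an accented e, encoded as UTF-8:
        // the accented letter takes two bytes (0xC3 0xA9).
        byte[] utf8 = {'c', 'a', 'f', (byte) 0xC3, (byte) 0xA9};

        String decoded = readFirstLine(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8);
        System.out.println(decoded.length()); // 4 characters

        // Decoding the same bytes with the wrong charset yields one character per byte.
        String garbled = readFirstLine(new ByteArrayInputStream(utf8), StandardCharsets.ISO_8859_1);
        System.out.println(garbled.length()); // 5 characters
    }
}
```

The second call shows why the charset matters: the bytes are identical, but the decoded text differs.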
OutputStreamWriter
o
An
OutputStreamWriter is a bridge from character streams to byte streams:
o
Characters written to it are encoded into
bytes using a specified charset.
o
The
charset that it uses may be specified by name or may be given explicitly, or
the platform's default charset may be accepted.
o
Each
invocation of a write() method causes the encoding converter to
be invoked on the given character(s). The resulting bytes are accumulated in a
buffer before being written to the underlying output stream. The size of this
buffer may be specified, but by default it is large enough for most purposes.
Note that the characters passed to the write() methods are not buffered.
o
For
top efficiency, consider wrapping an OutputStreamWriter within a BufferedWriter
so as to avoid frequent converter invocations. For example:
o
Writer
out = new BufferedWriter(new
OutputStreamWriter(System.out));
o
A surrogate
pair is a character represented by a sequence of two char values: A high
surrogate in the range '\uD800' to '\uDBFF' followed by a low surrogate
in the range '\uDC00' to '\uDFFF'. If the character represented by a surrogate
pair cannot be encoded by a given charset then a charset-dependent substitution
sequence is written to the output stream.
o
A malformed
surrogate element is a high surrogate that is not followed by a low
surrogate or a low surrogate that is not preceded by a high surrogate.
o
It is
illegal to attempt to write a character stream containing malformed surrogate
elements. The behavior of an instance of this class when a malformed surrogate
element is written is not specified.
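The encoding behavior described for OutputStreamWriter can be sketched with an in-memory stream. WriterDemo and encode are hypothetical names; the example writes a well-formed surrogate pair and a character that ISO 8859-1 cannot represent.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class WriterDemo {

    // Encodes text into bytes with the given charset via an OutputStreamWriter.
    static byte[] encode(String text, Charset charset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(bytes, charset)) {
            out.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Closing the writer flushed its internal buffer into the byte stream.
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        // U+1F600 lies outside the 16-bit range, so it is held as a surrogate pair.
        String pair = new String(Character.toChars(0x1F600));
        System.out.println(pair.length());                                // 2 char values
        System.out.println(Character.isHighSurrogate(pair.charAt(0)));    // true
        System.out.println(encode(pair, StandardCharsets.UTF_8).length);  // 4 bytes

        // The euro sign (U+20AC) has no ISO 8859-1 code, so the charset's
        // substitution byte ('?', 0x3F) is written instead.
        byte[] latin1 = encode("\u20ac", StandardCharsets.ISO_8859_1);
        System.out.println(latin1.length + " byte: 0x" + Integer.toHexString(latin1[0]));
    }
}
```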
Properties:
·
The Properties
class represents
a persistent set of properties.
·
The Properties
can be saved to
a stream or loaded from a stream.
·
Each
key and its corresponding value in the property list is a string.
·
Properties
file stores information about the characteristics of a program or
environment including internationalization/localization information.
·
A
properties file is in plain-text format
·
The keys in a properties file
must not change, because they will be referenced when your program fetches the
translated text
·
A
property list can contain another property list as its "defaults";
this second property list is searched if the property key is not found in the
original property list.
·
Because
Properties
inherits from Hashtable
,
the put
and putAll
methods can be applied to a Properties
object. Their use is strongly discouraged as they allow the caller to insert
entries whose keys or values are not Strings
.
The setProperty
method should be used instead.
·
If the store
or save
method is
called on a "compromised" Properties
object that contains a non-String
key or value, the call will fail.
·
When saving
properties to a stream or loading them from a stream, the ISO 8859-1 character
encoding is used. For characters that cannot be directly represented in
this encoding, Unicode
escapes are used; however, only a single 'u' character is allowed in
an escape sequence.
·
The native2ascii
tool can be used to convert property files to and from other character
encodings.
·
By
creating a Properties object and using the load method a program can read a
properties file. The program can then access the values by using the key as
follows:
o
Properties
props = new Properties();
o
props.load(new
BufferedInputStream(new FileInputStream("filename")));
o
String
value = props.getProperty("key");
·
Alternatively,
system properties can be specified on the command line at application startup time,
e.g. java
-Dmy.property=value MyApplication; these are read with System.getProperty("my.property")
·
If the
key is not found getProperty returns null.
·
PropertyResourceBundle
is
backed by a set of properties files.
ListResourceBundle
is backed by a class file
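The Properties behavior described above (defaults, Unicode escapes, null for missing keys) can be sketched as follows. PropertiesDemo and its load helper are hypothetical names, and the file content is supplied as a String rather than read from disk so the example is self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class PropertiesDemo {

    // Loads properties-file text on top of an optional "defaults" list.
    static Properties load(String text, Properties defaults) {
        Properties props = new Properties(defaults);
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties defaults = new Properties();
        // setProperty keeps keys and values as Strings, unlike put.
        defaults.setProperty("greeting", "Hello");
        defaults.setProperty("farewell", "Goodbye");

        // The doubled backslash puts a literal \u00e7 escape into the file text,
        // which load() then decodes into the corresponding character.
        Properties french = load("greeting = Bonjour\nname = Fran\\u00e7ois\n", defaults);

        System.out.println(french.getProperty("greeting"));       // Bonjour
        System.out.println(french.getProperty("farewell"));       // Goodbye (from defaults)
        System.out.println(french.getProperty("name").length());  // 8
        System.out.println(french.getProperty("missing"));        // null
        System.out.println(french.getProperty("missing", "n/a")); // n/a
    }
}
```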
·
The java.text package provides
classes and interfaces for handling text, dates, numbers, and messages in
a manner independent of natural languages. This means your main application or
applet can be written to be language-independent, and it can rely upon
separate, dynamically-linked localized resources. This allows the flexibility
of adding new localizations at any time.
·
All
classes in the java.text package are Locale sensitive
·
These
classes are capable of
o
formatting
and parsing dates, numbers, and messages;
o
searching
and sorting strings;
o
Iterating
over characters, words, sentences, and line breaks.
·
This
package contains three main groups of classes and interfaces:
o
Classes
for iteration over text
o
Classes
for formatting and parsing
o
Classes
for string collation
·
A CollationKey
represents a String
under the rules of a specific Collator
object.
·
The Collator
class
performs locale-sensitive String
comparison
·
An Annotation
object is used as a wrapper for a text attribute value if
the attribute has annotation characteristics.
·
Use
the BreakIterator
class only with natural-language text. To tokenize a programming language, use
the StreamTokenizer
class.
·
Unicode
is an international effort to provide a single character set that
everyone can use.
·
Java
uses the Unicode 2.0
(or 2.1) character encoding standard.
·
In the
Java programming language char
values represent Unicode characters. Unicode is a 16-bit
character encoding that supports the world's major languages
·
In Unicode, every character occupies two
bytes. Ranges of character encodings represent different writing
systems or other special symbols. For example, Unicode characters in the range 0x0000
through 0x007F
represent the basic
Latin alphabet, and characters in the range 0x4E00 through 0x9FFF represent the
Han characters.
·
UTF-8 is a multibyte encoding format,
which stores some characters as one byte and others as two or three bytes. If
most of your data is ASCII characters, it is more compact than 16-bit Unicode, but in
the worst case, a UTF-8 string can be 50 percent larger than the corresponding
16-bit Unicode string. Overall, it is fairly efficient.
·
Despite
the advantages of Unicode, there are some drawbacks: Unicode support is
limited on many platforms because of the lack of fonts capable of
displaying all the Unicode characters.
·
UTF-8
stands for Unicode Transformation Format, 8-bit encoding
form. It is a transmission format for Unicode that is suitable for use
with many network protocols and UNIX file systems.
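The size trade-off between UTF-8 and the 16-bit form can be checked with a short sketch. Utf8Demo and its two helpers are hypothetical names; the 16-bit size is computed as two bytes per char value, matching the description in these notes.

```java
import java.nio.charset.StandardCharsets;

public class Utf8Demo {

    // Number of bytes the text occupies when encoded as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes in the 16-bit form: two per char value.
    static int utf16Length(String text) {
        return text.length() * 2;
    }

    public static void main(String[] args) {
        String ascii = "A";       // basic Latin
        String accented = "\u00e9"; // Latin small letter e with acute
        String han = "\u4e2d";    // a Han character

        System.out.println(utf8Length(ascii) + " vs " + utf16Length(ascii));       // 1 vs 2
        System.out.println(utf8Length(accented) + " vs " + utf16Length(accented)); // 2 vs 2
        System.out.println(utf8Length(han) + " vs " + utf16Length(han));           // 3 vs 2
    }
}
```

The last line is the worst case mentioned above: three bytes instead of two, which is 50 percent larger.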
·
An
Annotation object is used as a wrapper for a text attribute value
if the attribute has annotation characteristics. These characteristics are:
o
The text
range that the attribute is applied to is critical to the semantics of
the range. That means, the attribute cannot be applied to subranges of
the text range that it applies to, and, if two adjacent text
ranges have the same value for this attribute, the attribute still
cannot be applied to the combined range as a whole with this value.
o
The
attribute or its value usually no longer applies if the underlying text
is changed.
·
A
CollationKey represents a String under the rules of a specific Collator
object.
·
Comparing
two CollationKeys returns the relative order of the Strings they
represent.
·
Using
CollationKeys to compare Strings is generally faster than using
Collator.compare. Thus, when the Strings must be compared multiple times,
for example when sorting a list of Strings, it's more efficient
to use CollationKeys.
·
You
cannot create CollationKeys directly. Rather, generate them by calling Collator.getCollationKey.
·
You can
only compare CollationKeys generated from the same Collator object.
·
Generating
a CollationKey for a String involves examining the entire String
and converting it to a series of bits that can be compared bitwise.
This allows fast comparisons once the keys are generated.
·
The
cost of generating keys is recouped in faster comparisons when Strings need to
be compared many times.
·
Collator.compare examines only
as many characters as it needs which allows it to be faster when doing single
comparisons.
·
The
Collator class performs locale-sensitive String comparison.
·
Use
this class to build searching and sorting routines for natural
language text.
·
Collator
is an abstract base class. Subclasses implement specific
collation strategies. You can use the static factory method, getInstance,
to obtain the appropriate Collator object for a given locale.
·
The Character
comparison
methods use the Unicode standard to identify character properties.
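The two comparison styles discussed above, Collator.compare and CollationKey, can be sketched side by side. CollationDemo and its helper methods are hypothetical names; both helpers should produce the same order, and the difference lies only in when the comparison work is done.

```java
import java.text.CollationKey;
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class CollationDemo {

    // Sorts using Collator.compare for every comparison.
    static List<String> sortWithCollator(List<String> words, Locale locale) {
        List<String> copy = new ArrayList<>(words);
        copy.sort(Collator.getInstance(locale)); // Collator is a Comparator
        return copy;
    }

    // Sorts using CollationKeys, all generated once from the same Collator.
    static List<String> sortWithKeys(List<String> words, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        CollationKey[] keys = new CollationKey[words.size()];
        for (int i = 0; i < keys.length; i++) {
            keys[i] = collator.getCollationKey(words.get(i));
        }
        Arrays.sort(keys); // CollationKey is Comparable; comparisons are bitwise
        List<String> sorted = new ArrayList<>();
        for (CollationKey key : keys) {
            sorted.add(key.getSourceString());
        }
        return sorted;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("cherry", "Banana", "apple");

        List<String> plain = new ArrayList<>(words);
        plain.sort(null); // natural String order compares char values
        System.out.println(plain);                             // [Banana, apple, cherry]

        System.out.println(sortWithCollator(words, Locale.US)); // [apple, Banana, cherry]
        System.out.println(sortWithKeys(words, Locale.US));     // [apple, Banana, cherry]
    }
}
```

For a one-off comparison the Collator version is simpler; the key-based version pays the key generation cost up front and is the better fit for repeated sorting.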
Character Encoding: A character encoding is a mapping between
characters and code values.
Input method:
·
Lets
users enter thousands of different characters using keyboards with far fewer
keys.
·
The
user may have input methods for different languages or input methods that
accept various types of input
·
Input
method framework:
enables all text editing components to receive Japanese, Chinese, or Korean
text input through input methods.
Scenario | Solution |
You need to find a localized value for a given key, for example, an error message | Use java.util.Properties to load values from a stream (e.g. a java.io.FileInputStream) and then use a single lookup key to obtain a localized value |
You need to format and present numbers and currencies. | Use java.text.NumberFormat. |
You need to format and present dates and times | Use java.text.DateFormat. |
You need to order and handle text data. | Use Collator and CollationKey for ordering and MessageFormat, ResourceBundle, or PropertyResourceBundle to handle text. |
You need to read and write files. | Use InputStreamReader for reading and OutputStreamWriter for writing. |
You need to create localized JSPs. | Use Locale, contentType, and pageEncoding attributes. |
You need to create localized servlets. | Use Locale and ServletResponse.setContentType() and ServletResponse.setLocale() methods |
You are developing an application that will only execute in a single and very narrow geographic location. | There is no need to develop the application using Java's internationalization feature. |
You are creating an application for a company with offices in several countries and time zones. Where possible, the application needs to adapt its functionality and presentation to local customs and language. | Use Java's internationalization feature to develop this application. |
Converting byte stream to character stream (or) locale-specific encoding to Unicode | InputStreamReader |
Converting character streams to byte streams (or) Unicode to region-specific encoding | OutputStreamWriter |
Locale-sensitive string/character comparisons/sort | Use Collator Object |
For repeated searching and sorting of strings | Use CollationKey Class |
To isolate localizable elements from the rest of the application. | ResourceBundle Object |
contains | Use PropertyResourceBundle object |
format a compound message in a locale-independent manner | construct a pattern that you apply to a |
To detect character, word, sentence and line boundaries | BreakIterator Class |
java.text.NumberFormat
java.text.DecimalFormat
java.text.DateFormat
java.text.SimpleDateFormat
java.text.MessageFormat
java.text.BreakIterator
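Two of the classes listed above, MessageFormat and BreakIterator, can be sketched briefly. TextDemo, fileReport, and words are hypothetical names, and the message pattern is hard-coded here for brevity; in a localized program it would more likely come from a ResourceBundle.

```java
import java.text.BreakIterator;
import java.text.MessageFormat;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class TextDemo {

    // Fills a compound message pattern; the number is formatted for the locale.
    static String fileReport(String disk, int count, Locale locale) {
        MessageFormat format =
                new MessageFormat("{0} has {1,number,integer} files", locale);
        return format.format(new Object[] {disk, count});
    }

    // Splits natural-language text into words using locale-aware boundaries.
    static List<String> words(String text, Locale locale) {
        BreakIterator boundaries = BreakIterator.getWordInstance(locale);
        boundaries.setText(text);
        List<String> result = new ArrayList<>();
        int start = boundaries.first();
        for (int end = boundaries.next(); end != BreakIterator.DONE;
                start = end, end = boundaries.next()) {
            String piece = text.substring(start, end);
            // Boundaries also delimit spaces and punctuation; keep only real words.
            if (Character.isLetterOrDigit(piece.charAt(0))) {
                result.add(piece);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(fileReport("MyDisk", 1234, Locale.US));      // MyDisk has 1,234 files
        System.out.println(fileReport("MyDisk", 1234, Locale.GERMANY)); // MyDisk has 1.234 files
        System.out.println(words("Hello, world!", Locale.US));          // [Hello, world]
    }
}
```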